Abstract: A web crawler can take more than a week to crawl the World Wide Web completely. This paper focuses on the role of agents in providing intelligent crawling over the web, and the design of an application-level proxy server is discussed in detail. Web pages need to be cached to improve response time, but pages change during this interval, so a conventional crawler cannot always deliver up-to-date content to the user. Intellect Webbot reduces the latency of search results by employing several agents, provides more recently updated links to the user, and lets users view their bookmarks from anywhere through the system. Moreover, the system uses distributed intelligent agents to index web pages on the server with up-to-date information. In a typical scenario, the user submits a keyword query to the system, which comprises several agents: a link repository agent, a regional crawler agent, a link maintenance agent, and a bookmark agent. The system returns a list of URLs, each with a description of the corresponding page; each link on the result page is called a context link. Context links are formed from the user-given keyword and the related links available in the link repository, and are included in the result page as a list of context links. Unlike other search engines, the proposed crawler provides context links tailored to the user's interests; this is accomplished by storing each user's name along with their search history on the server. The dynamic web cache management scheme is tested across 30 nodes and its results are discussed. The proposed intelligent crawler is compared with LLI and the dynamic web cache scheme, and the results of these experiments confirm the efficiency and adaptability of the proposed crawler.
Keywords: Intellect Webbot, World Wide Web, Web Crawler, Intelligent Agent.